{
 "cells": [
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Copyright (c) Microsoft Corporation. All rights reserved.\n",
    "\n",
    "Licensed under the MIT License."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## AKS Load Testing\n",
    "\n",
    "Once a model has been deployed to production it is important to ensure that the deployment target can support the expected load (number of users and expected response speed). This is critical for providing recommendations in production systems that must support recommendations for multiple users simultaneously. As the number of concurrent users grows the load on the recommendation system can increase significantly, so understanding the limits of any operationalized system is necessary to avoid unwanted system failures or slow response times for users. \n",
    "\n",
    "To perform this kind of load test we can leverage tools that simulate user requests at varying rates and establish how many requests per seconds, or what the average response time is for the service. This notebook walks through the process of performing load testing for a deployed model on Azure Kubernetes Service (AKS).\n",
    "\n",
    "This notebook assumes an AKS Webservice was used to deploy the model from a Azure Machine Learning service Workspace.\n",
    "An example of this approach is provided in the [LightGBM Operationalization notebook](lightgbm_criteo_o16n.ipynb).\n",
    "\n",
    "We use [Locust](https://docs.locust.io/en/stable/) to perform the load testing, see documentation for more details about this tool."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 1,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "Azure ML SDK version: 1.0.18\n"
     ]
    }
   ],
   "source": [
    "import os\n",
    "import subprocess\n",
    "import sys\n",
    "from tempfile import TemporaryDirectory\n",
    "from urllib.parse import urlparse\n",
    "\n",
    "sys.path.append('../..')\n",
    "\n",
    "import requests\n",
    "\n",
    "from azureml.core import Workspace\n",
    "from azureml.core import VERSION as azureml_version\n",
    "from azureml.core.webservice import AksWebservice\n",
    "\n",
    "from reco_utils.dataset.criteo import get_spark_schema, load_pandas_df\n",
    "from reco_utils.azureml.azureml_utils import get_or_create_workspace\n",
    "\n",
    "# Check core SDK version number\n",
    "print(\"Azure ML SDK version: {}\".format(azureml_version))"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 2,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/html": [
       "<style>.container { width:100% !important; }</style>"
      ],
      "text/plain": [
       "<IPython.core.display.HTML object>"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    }
   ],
   "source": [
    "# We increase the cell width to capture all the output from locust later\n",
    "from IPython.core.display import display, HTML\n",
    "display(HTML(\"<style>.container { width:100% !important; }</style>\"))"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### Create a temporary directory for generated files"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 3,
   "metadata": {},
   "outputs": [],
   "source": [
    "TMP_DIR = TemporaryDirectory()"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### Retrieve the AKS service information"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 4,
   "metadata": {},
   "outputs": [],
   "source": [
    "# this must match the service name that has been deployed\n",
    "SERVICE_NAME = 'lightgbm-criteo'"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 5,
   "metadata": {},
   "outputs": [
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "Warning: Falling back to use azure cli login credentials.\n",
      "If you run your code in unattended mode, i.e., where you can't give a user input, then we recommend to use ServicePrincipalAuthentication or MsiAuthentication.\n",
      "Please refer to aka.ms/aml-notebook-auth for different authentication mechanisms in azureml-sdk.\n"
     ]
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "Found the config file in: C:\\Users\\scgraham\\repos\\Recommenders\\notebooks\\05_operationalize\\aml_config\\config.json\n",
      "Wrote the config file config.json to: C:\\Users\\scgraham\\repos\\Recommenders\\notebooks\\05_operationalize\\aml_config\\config.json\n"
     ]
    }
   ],
   "source": [
    "ws = get_or_create_workspace()"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 6,
   "metadata": {},
   "outputs": [],
   "source": [
    "aks_service = AksWebservice(ws, name=SERVICE_NAME)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 7,
   "metadata": {},
   "outputs": [],
   "source": [
    "# Get the scoring the URI\n",
    "url = aks_service.scoring_uri\n",
    "parsed_url = urlparse(url)\n",
    "\n",
    "# Setup authentication using one of the keys from aks_service\n",
    "headers = dict(Authorization='Bearer {}'.format(aks_service.get_keys()[0]))"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### Get Sample data for testing"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 8,
   "metadata": {},
   "outputs": [
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "8.79MB [00:04, 1.93MB/s]                                                                                                                                                                                                                                                   \n"
     ]
    }
   ],
   "source": [
    "# Grab some sample data\n",
    "df = load_pandas_df(size='sample')"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 9,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "{\"label\":0,\"int00\":1.0,\"int01\":1,\"int02\":5.0,\"int03\":0.0,\"int04\":1382.0,\"int05\":4.0,\"int06\":15.0,\"int07\":2.0,\"int08\":181.0,\"int09\":1.0,\"int10\":2.0,\"int11\":null,\"int12\":2.0,\"cat00\":\"68fd1e64\",\"cat01\":\"80e26c9b\",\"cat02\":\"fb936136\",\"cat03\":\"7b4723c4\",\"cat04\":\"25c83c98\",\"cat05\":\"7e0ccccf\",\"cat06\":\"de7995b8\",\"cat07\":\"1f89b562\",\"cat08\":\"a73ee510\",\"cat09\":\"a8cd5504\",\"cat10\":\"b2cb9c98\",\"cat11\":\"37c9c164\",\"cat12\":\"2824a5f6\",\"cat13\":\"1adce6ef\",\"cat14\":\"8ba8b39a\",\"cat15\":\"891b62e7\",\"cat16\":\"e5ba7672\",\"cat17\":\"f54016b9\",\"cat18\":\"21ddcdc9\",\"cat19\":\"b1252a9d\",\"cat20\":\"07b5194c\",\"cat21\":null,\"cat22\":\"3a171ecb\",\"cat23\":\"c5c50484\",\"cat24\":\"e8b83407\",\"cat25\":\"9727dd16\"}\n"
     ]
    }
   ],
   "source": [
    "data = df.iloc[0, :].to_json()\n",
    "print(data)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 10,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "'{\"result\": 0.35952275816753043}'"
      ]
     },
     "execution_count": 10,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "# Ensure the aks service is running and provides expected results\n",
    "aks_service.run(data)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 11,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "{\"result\": 0.35952275816753043}\n"
     ]
    }
   ],
   "source": [
    "# Make sure an HTTP request to the service will also work\n",
    "response = requests.post(url=url, json=data, headers=headers)\n",
    "print(response.json())"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### Setup LocustFile\n",
    "\n",
    "Locust uses a locust file (defaulting to locustfile.py) which controls the user behavior. \n",
    "\n",
    "In this example we create a UserBehavior class which encapsulates the tasks that the user will conduct each time it is started. We are only interested in ensure the service can handle a request with sample data so the only task used is the score task which is a simple post request like what was done manually above.\n",
    "\n",
    "The next class defines how a user will be instantiated, in this case we create a user which will make start an http session with the host server and execute the defined tasks. The task will be repeated after waiting for a small period of time. That wait period is determined by making a uniform random sample between the min and max wait times (in milliseconds)."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 12,
   "metadata": {},
   "outputs": [],
   "source": [
    "locustfile = \"\"\"\n",
    "from locust import HttpLocust, TaskSet, task\n",
    "\n",
    "\n",
    "class UserBehavior(TaskSet):\n",
    "    @task\n",
    "    def score(self):\n",
    "        self.client.post(\"{score_url}\", json='{data}', headers={headers})\n",
    "\n",
    "\n",
    "class WebsiteUser(HttpLocust):\n",
    "    task_set = UserBehavior\n",
    "    # min and max time to wait before repeating task\n",
    "    min_wait = 1000\n",
    "    max_wait = 2000\n",
    "\"\"\".format(data=data, headers=headers, score_url=parsed_url.path)\n",
    "\n",
    "locustfile_path = os.path.join(TMP_DIR.name, 'locustfile.py')\n",
    "with open(locustfile_path, 'w') as f:\n",
    "    f.write(locustfile)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "The next step is to start the locust load test tool. It can be run with a web interface or directly from the command line. In this case we will just run it from the command line and specify the number of concurrent users, how fast the users should spawn and how long the test should run for. All these options can be controlled via the web interface gui as well as providing more information on failures so it is useful to read the documentation for more advanced usage. Here we will just run the test and capture the summary results."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 13,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "[2019-05-28 12:36:31,630] 9821192-1116/INFO/locust.main: Run time limit set to 60 seconds\n",
      "[2019-05-28 12:36:31,631] 9821192-1116/INFO/locust.main: Starting Locust 0.11.0\n",
      "[2019-05-28 12:36:31,631] 9821192-1116/INFO/locust.runners: Hatching and swarming 200 clients at the rate 10 clients/s...\n",
      "[2019-05-28 12:36:51,864] 9821192-1116/INFO/locust.runners: All locusts hatched: WebsiteUser: 200\n",
      "[2019-05-28 12:37:30,701] 9821192-1116/INFO/locust.main: Time limit reached. Stopping Locust.\n",
      "[2019-05-28 12:37:30,707] 9821192-1116/INFO/locust.main: Shutting down (exit code 0), bye.\n",
      "[2019-05-28 12:37:30,707] 9821192-1116/INFO/locust.main: Cleaning up runner...\n",
      "[2019-05-28 12:37:30,738] 9821192-1116/INFO/locust.main: Running teardowns...\n",
      " Name                                                          # reqs      # fails     Avg     Min     Max  |  Median   req/s\n",
      "--------------------------------------------------------------------------------------------------------------------------------------------\n",
      " POST /api/v1/service/lightgbm-criteo/score                      5298     0(0.00%)     364      34     927  |     390  104.30\n",
      "--------------------------------------------------------------------------------------------------------------------------------------------\n",
      " Total                                                           5298     0(0.00%)                                     104.30\n",
      "\n",
      "Percentage of the requests completed within given times\n",
      " Name                                                           # reqs    50%    66%    75%    80%    90%    95%    98%    99%   100%\n",
      "--------------------------------------------------------------------------------------------------------------------------------------------\n",
      " POST /api/v1/service/lightgbm-criteo/score                       5298    390    420    440    460    500    530    590    640    930\n",
      "--------------------------------------------------------------------------------------------------------------------------------------------\n",
      " Total                                                            5298    390    420    440    460    500    530    590    640    930\n",
      "\n",
      "\n"
     ]
    }
   ],
   "source": [
    "cmd = \"locust -H {host} -f {path} --no-web -c {users} -r {rate} -t {duration} --only-summary\".format(\n",
    "    host='{url.scheme}://{url.netloc}'.format(url=parsed_url),\n",
    "    path=locustfile_path,\n",
    "    users=200,  # concurrent users\n",
    "    rate=10,  # hatch rate (users / second)\n",
    "    duration='1m',  # test duration\n",
    ")\n",
    "process = subprocess.run(cmd, shell=True, stderr=subprocess.PIPE)\n",
    "print(process.stderr.decode('utf-8'))"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### Load Test Results\n",
    "\n",
    "Above you can see the number of requests, failures and statistics on response time, as well as the number of requests per second that the server is handling.\n",
    "\n",
    "The second line shows the distribution of response times which can be helpful to understand over all the requests how the load is impacting the response speed and whether there may be outliers which are impacting performance."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### Cleanup temporary directory"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 14,
   "metadata": {},
   "outputs": [],
   "source": [
    "TMP_DIR.cleanup()"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": []
  }
 ],
 "metadata": {
  "authors": [
   {
    "name": "pasha"
   }
  ],
  "kernelspec": {
   "display_name": "Python 3",
   "language": "python",
   "name": "python3"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 3
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
   "version": "3.6.8"
  },
  "name": "deploy-to-aci-04",
  "notebookId": 904892461294324
 },
 "nbformat": 4,
 "nbformat_minor": 1
}
